Linguistic scope-based and biological event-based speculation and negation annotations in the BioScope and Genia Event corpora

Authors

  • Veronika Vincze
  • György Szarvas
  • György Móra
  • Tomoko Ohta
  • Richárd Farkas
Abstract

BACKGROUND The treatment of negation and hedging in natural language processing has received much interest recently, especially in the biomedical domain. However, open-access corpora annotated for negation and/or speculation are scarce, which limits the training and testing of applications, and the corpora that do exist sometimes follow different design principles. This paper compares the annotation principles of the two largest corpora annotated for negation and speculation, BioScope and Genia Event. BioScope marks linguistic cues and their scopes for negation and hedging, whereas Genia marks biological events for uncertainty and/or negation.

RESULTS Differences between the annotations of the two corpora are categorized thematically, and the frequency of each category is estimated. The largest number of differences arises because scopes, which cover text spans, are built around key events, so every argument of a key event (including events within events) falls under the scope as well. In contrast, Genia treats the modality of events within events independently.

CONCLUSIONS The analysis of multiple layers of annotation (linguistic scopes and biological events) showed that detecting negation/hedge keywords and their scopes can help determine the modality of key events (those denoted by the main predicate). Detecting the negation and speculation status of events within events, on the other hand, requires additional syntax-based rules that examine the dependency path between the modality cue and the event cue.
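The contrast between the two annotation styles can be illustrated with a small sketch. It is not the procedure used in the paper: the `Scope` and `Event` classes, the `modality_from_scopes` function, the `is_key` flag and the example sentence are all hypothetical, chosen only to show why a scope span settles the modality of a key event while leaving nested events undecided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """A BioScope-style annotation: a cue plus the token span it covers (hypothetical layout)."""
    cue: str
    kind: str    # "negation" or "speculation"
    start: int   # first token index in the scope (inclusive)
    end: int     # last token index in the scope (exclusive)


@dataclass(frozen=True)
class Event:
    """A Genia-style event, reduced to its trigger word (hypothetical layout)."""
    trigger: str
    index: int     # token index of the trigger
    is_key: bool   # True if the trigger is the main predicate inside the scope


def modality_from_scopes(events, scopes):
    """Map each event trigger to a list of (kind, status) pairs.

    A key event inside a scope inherits the scope's modality. A nested event
    inside the same scope is only marked "undecided", because the span alone
    does not say whether the modality applies to it.
    """
    result = {}
    for event in events:
        labels = []
        for scope in scopes:
            if scope.start <= event.index < scope.end:
                status = "inherited" if event.is_key else "undecided"
                labels.append((scope.kind, status))
        result[event.trigger] = labels
    return result


# Invented example sentence, tokenized by hand.
tokens = ["IL-2", "does", "not", "induce", "expression", "of", "c-fos"]
scopes = [Scope(cue="not", kind="negation", start=2, end=7)]
events = [
    Event(trigger="induce", index=3, is_key=True),       # key event under the cue
    Event(trigger="expression", index=4, is_key=False),  # event within an event
]

result = modality_from_scopes(events, scopes)
for trigger, labels in result.items():
    print(trigger, labels)
```

In this toy sentence both triggers lie inside the scope of "not", but only "induce" can be labeled from the span alone. Resolving "expression" would need the kind of syntax-based rule over the dependency path between cue and event trigger that the conclusions call for.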

Similar resources

Linguistic scope-based and biological event-based speculation and negation annotations in the Genia Event and BioScope corpora

Background: The treatment of negation and hedging in natural language processing has received much interest recently, especially in the biomedical domain. However, open access corpora annotated for negation and/or speculation are hardly available for training and testing applications, and even if they are, they sometimes follow different design principles. In this paper, the annotation principl...

Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes

Detecting the linguistic scope of negated and speculated information in text is an important Information Extraction task. This paper presents ScopeFinder, a linguistically motivated rule-based system for the detection of negation and speculation scopes. The system rule set consists of lexico-syntactic patterns automatically extracted from a corpus annotated with negation/speculation cues and th...

The BioScope corpus: annotation for negation, uncertainty and their scope in biomedical texts

This article reports on a corpus annotation project that has produced a freely available resource for research on handling negation and uncertainty in biomedical texts (we call this corpus the BioScope corpus). The corpus consists of three parts, namely medical free texts, biological full papers and biological scientific abstracts. The dataset contains annotations at the token level for negativ...

Speculation and Negation Scope Detection via Convolutional Neural Networks

Speculation and negation are important information to identify text factuality. In this paper, we propose a Convolutional Neural Network (CNN)-based model with probabilistic weighted average pooling to address speculation and negation scope detection. In particular, our CNN-based model extracts those meaningful features from various syntactic paths between the cues and the candidate tokens in b...

Bridging the Gap Between Scope-based and Event-based Negation/Speculation Annotations: A Bridge Not Too Far

We study two approaches to the marking of extra-propositional aspects of statements in text: the task-independent cue-and-scope representation considered in the CoNLL-2010 Shared Task, and the tagged-event representation applied in several recent event extraction tasks. Building on shared task resources and the analyses from state-of-the-art systems representing the two broad lines of research,...

Volume: 2

Publication date: 2011